The Institute for Democracy and Higher Education (IDHE) works to advance
learning that strengthens democracy and advances social and political
equity. IDHE’s signature initiative, the National Study of Learning, Voting,
and Engagement (NSLVE), was established in 2013 as both a service to higher 
education - providing participating colleges and universities with their students’ 
registration and voting rates - and a significant database for research. The NSLVE 
database is created by matching publicly available local and state voting records 
with enrollment lists from colleges and universities nationally. Institutions must 
opt in, and at the time of this writing, the study includes more than 1,000 Title IV, 
degree-granting colleges and universities. These institutions represent various 
types (e.g., community colleges, liberal arts colleges), missions (e.g., religiously 
affiliated institutions), student populations (e.g., full-time, part-time), and geo- 
graphic locations. All 50 states are represented in the database. The database con- 
sists of enrollment records for approximately 10 million students from the relevant 
fall semester and voting records from the 2012 and 2016 presidential elections and 
the 2014 midterm election. The NSLVE data consists of de-identified student re- 
cords and contains no names or other information that would allow researchers 
to identify an individual student. 

In this report, we describe the systematic process of creating and maintaining
the NSLVE database. The process includes: 1) recruiting college and university
campuses to obtain permission to use student data, 2) partnering with the National
Student Clearinghouse (“the Clearinghouse”) to obtain student enrollment
records, 3) purchasing publicly available voting records from an organization called
Catalist, 4) working with the Clearinghouse to merge and de-identify student enrollment
and voting records, and 5) calculating institutional voting rates.


1. Recruiting Colleges and Universities 


Participation in NSLVE is not automatic, and colleges and universities must opt 
into the study by signing an authorization form allowing their enrollment records to be 
used for NSLVE specifically. The students in the NSLVE database are those who were on 
enrollment lists of participating institutions on a date closest to but before the Novem- 
ber elections in 2012, 2014, and 2016. 

To participate in the study, institutions must be degree-granting, not-for-profit pub- 
lic and private institutions in the U.S., and they must provide enrollment records to the 
Clearinghouse. To recruit campuses, we formed partnerships with existing associations 
and consortia in higher education, such as the American Association of State Colleges 
and Universities (AASCU), the Association for Institutional Research (AIR), Campus 
Compact, and the Higher Education Data Sharing Consortium (HEDS). We sought a 
range of partnerships to reach institutions of all Carnegie Classifications (state colleges, 
community colleges) and individuals on campus who might be interested in the data 
(institutional researchers and civic engagement offices). We asked these partners to dis- 
seminate information about the study to their members and networks. For many part- 
ners, we hosted informational webinars and teleconferences to answer questions about 
the study. We also created a “Frequently Asked Questions” page about NSLVE,
http://activecitizen.tufts.edu/nslve-faq/, as well as a separate FAQ on student privacy protections,
http://activecitizen.tufts.edu/wp-content/uploads/NSLVE-FERPA-FAQ.pdf.

Participation in the study is free, and each participating institution receives a tai- 
lored report containing that institution’s student voter registration and voting rates. 
One recruiting goal is for the study to obtain a representative sample of colleges and 
universities in the U.S. 


2. Partnering with the National Student Clearinghouse 


Founded in 1993, the Clearinghouse was established to streamline the process of 
allowing lenders to confirm that students with loans were still in school and therefore 
eligible for deferment of repayment. Over time, the Clearinghouse’s role expanded to 
include degree verification, research, and other services. Currently, a vast majority of
U.S. colleges and universities participate in the Clearinghouse, and students at these
institutions represent over 98% of students enrolled at private and public U.S. institutions.
Students not in the NSLVE database are usually those who have exercised their
right pursuant to the Family Educational Rights and Privacy Act (FERPA) to deny the
institution the right to use their information for any purpose, including for institutional
research.


To participate in the Clearinghouse, institutions must provide specific information 
on each student, including name, date of birth, last known permanent address, and 
enrollment status. Institutions have the option of providing information such as major
field of study (e.g., physics, business), class level (e.g., sophomore, graduate student), 
race/ethnicity, gender, and whether the student is seeking a degree. In 2012, the Clear- 
inghouse started using official Classification of Instructional Programs (CIP) codes to 
capture a student’s field of study (NCES, 2010). Some types of institutions, such as two- 
year private and for-profit institutions, are less likely than others to provide data to the 
Clearinghouse. Dynarski, Hemelt, and Hyman (2015) found that enrollment coverage is
highest among public institutions and lowest among for-profit colleges. In addition, not 
all colleges and universities report student characteristics such as race/ethnicity and 
gender. Dynarski et al. (2015) found that enrollment coverage is lower for minorities but 
similar for males and females. Coverage also varies in the NSLVE dataset by field of 
study, enrollment status (full-time or part-time), and degree-seeking status.


Use of the Clearinghouse data by policy makers, educators, and academic research- 
ers is relatively new, and because the Clearinghouse data is at the student level, it offers 
a significant contribution to education research. Researchers have previously used 
Clearinghouse data to explore the effects of particular programs or policies on postsec- 
ondary attendance, persistence, and attainment (Dynarski, Hemelt, & Hyman, 2013; 
Dynarski, Hyman, & Schanzenbach, 2013; Hyman, 2013). For many years, the U.S. De- 
partment of Education has been seeking legislative permission to augment the existing 
Integrated Postsecondary Education Data System (IPEDS), which collects institu- 
tion-level data, with student-level data to enable research using individual student 
records. 


Table 1 provides information on variables from the Clearinghouse that are included 
in the NSLVE dataset. About one-third of NSLVE institutions provide race/ethnicity 
information for their students. Race/ethnicity is particularly important to NSLVE be- 
cause, in completing that data field, institutions are given the opportunity to identify 
nonresident aliens, generally international students who are non-citizens. Also, the 
Clearinghouse recently began requiring colleges and universities to identify each stu- 
dent as either a graduate or undergraduate student. Previously, to distinguish between 
graduate and undergraduate students, we relied on “class level” information, which was 
not reported for all students. The NSLVE database contains no student names or data 
that would allow researchers to identify any student. 


Table 1. NSLVE Variables from National Student Clearinghouse 


Variable Name         Description
OPE ID                Office of Postsecondary Education ID.
Unique ID             Generated ID number identifying each record in the file
                      (specific to student/campus).
NSC link              Generated ID number identifying each student in the file
                      (can be used to match students across campuses).
Birth year            Year of the student’s birth.
Age at election       Age at election is computed from the campus-provided
                      birthdate.
Campus city           City where the campus is located.
Campus state          State where the campus is located.
Home zip code         Earliest known zip code for the student.
Class                 Class level for the student at the time of the election.
                      Options include: Associate’s, Bachelor’s, Freshman,
                      Sophomore, Junior, Senior, Undergraduate Certificate,
                      Unspecified undergraduate, Master’s, Doctoral,
                      Post-doctorate, First professional, Unspecified
                      graduate/sessional, Post baccalaureate certificate.
Program level+        Program level for the student at the time of the election.
                      If more than one was provided, the highest level was
                      selected. Options include: Undergraduate Certificate,
                      Associate’s Degree, Bachelor’s Degree, Post Baccalaureate
                      Certificate, Master’s Degree, Doctoral Degree, First
                      Professional Degree, Graduate/Professional Certificate,
                      Non Credential Program (Preparatory Coursework/Teacher
                      Certification).
Major                 Major for the student at the time of the election.
CIP code              Classification of Instructional Programs (CIP) code at the
                      time of the election. This may not be associated with the
                      major reported in the analysis file. If no CIP was
                      available from the institution, NSC imputed the code from
                      the free-text major field.
CIP description       NCES description of the CIP code.
CIP family            NCES program area to which the CIP code belongs.
CIP imputed flag      Y = The CIP code was imputed from the free-text major.
                      Null = The CIP code is as provided by the participating
                      campus.
Race*                 Race for the student at the time of the election. Options
                      include Nonresident alien, Asian, Black, American
                      Indian/Alaskan Native, Asian/Pacific Islander, Hispanic,
                      Native Hawaiian or Other Pacific Islander, White, Two or
                      More Race/Ethnicity Categories, Race/Ethnicity Unknown.
                      Only available for campuses that signed non-directory form.
Gender*               Gender for the student at the time of the election.
                      Options include Male, Female. Only available for campuses
                      that signed non-directory form.
Enrollment status*    Enrollment status for the student at the time of the
                      election. Options include Full-time, Part-time
                      (quarter-time, half-time, or less-than-half-time status).
                      Only available for campuses that signed non-directory form.
Degree seeking*       Degree-seeking indicator for the student at the time of
                      the election. Options include: Student is seeking a
                      degree, Student is not seeking a degree. Only available
                      for campuses that signed non-directory form.
Age flag              Eligibility based on age. Options include:
                      Yes = campus-provided birth date is before or equal to
                      November 6, 1994 or November 4, 1996,
                      No = campus-provided birth date is after November 6, 1994
                      or November 4, 1996,
                      U = birth date unavailable.
SSN flag              Eligibility based on Social Security Number (SSN).
                      Options include Yes = campus provided a valid SSN for
                      this student, N = campus did not provide a valid SSN for
                      this student.

* Provided only for institutions that sign an authorization form allowing
NSC to share this “non-directory” data element.

+ Only available in 2014 and 2016.


3. Purchasing Voting Records from Catalist 


The NSLVE database uses publicly available state and local voting records collected 
by Catalist, an organization that collects, cleans, and updates voter files of more than 
180 million registered voters in all 50 states and the District of Columbia. Whether a 
person registered to vote and voted (not for whom they voted) are matters of public re- 
cord, but because voting records are inconsistently maintained by states and municipal- 
ities, they can be challenging to track down. Catalist sells subscriptions to organizations 
interested in using the database to conduct research and is widely respected and used by 
academic researchers (see Ashok, Feder, McGrath, & Hersh, 2014; Ansolabehere & 
Hersh, 2012, 2013). While Catalist does its own modeling for clients, we use only the 
publicly available voting records the organization collects. 


Pursuant to a contract with Tufts University, Catalist creates a snapshot of the com- 
plete set of voting records for each federal election. This snapshot is preserved for use as 
more colleges and universities join the study. Snapshots are created once all of the vot- 
ing records have been compiled after an election (to date, April 2013, April 2015, and 
June 2017). Catalist has created three snapshots, one for each of the 2012, 2014, and 2016 
elections. These snapshots are solely for the purpose of this study and are not accessible 
to others. Table 2 lists NSLVE variables that come from Catalist. 

Table 2. NSLVE Variables from Catalist


Variable Name                   Description
Match confidence                Match confidence score indicating the level of
                                confidence that student in Clearinghouse database
                                was correctly matched to voting records in Catalist
                                database.
Election day age                Age at election is computed from the
                                campus-provided birthdate.
Gender                          Gender represented as male, female, or unknown.
                                From voter file.
Race                            Possible values: Black, Caucasian, Asian, Native
                                American, Middle Eastern, Hispanic, Jewish, Other,
                                Unknown. May come from voter file or a
                                Catalist-generated algorithm predicting race based
                                on commercial and other data sources.
Race confidence                 Possible values: Possibly, Likely, Highly Likely,
                                Uncoded. Level of confidence that Catalist’s
                                algorithm predicted the correct race/ethnicity.
Ethnicity                       Ethnic subcategory breakdown. This is always
                                supplied by the commercial vendor and is provided
                                as a self-explanatory decoded string.
Registration date               The date the student registered to vote on the
                                current file. Always comes from the voter file, but
                                may have 00 for day and/or month.
Voter status                    Options: ‘active’ = active voter registration,
                                ‘inactive’ = only set for people on the current
                                voter file with an inactive value set in that file,
                                ‘dropped’ = individual has vote history and
                                appeared on past files but does not appear in the
                                most recent voter file, ‘multiple appearances’ =
                                person with a mailing address in that state, who
                                was sourced from another state file, and who is not
                                found as re-registered for that state,
                                ‘unregistered’ = voter is not on the current or
                                past voter files but is known to reside in the
                                state, ‘unmatched member’ = record uploaded by the
                                client that did not match the existing Catalist
                                database.
Registration address/city       Field is a subset of the CASS (Coding Accuracy
                                Support System) address fields split into
                                component parts.
Registration address/state      Valid postal abbreviation for state.
Registration address/zip code   5 or 9 digit zip when known.
Mailing address/city            Field is a subset of the CASS parsed address fields
                                split into component parts.
Mailing address/state           Valid postal abbreviation for state.
Mailing address/zip             5 or 9 digit zip when known.
County FIPS                     Valid 3-digit county FIPS (Federal Information
                                Processing Standard) code.
Method of voting -              Possible values: absentee, early vote, mail,
general election                polling, unknown.
Precinct code                   Definition, usage, and fill rate may vary by state.
Ward                            Appended snapshot. Definition, usage, and fill rate
                                may vary by state.
Congressional district          Current congressional district for voter, as
                                supplied by voter file. When not present on the
                                voter file, this may be imputed from the
                                registration address.
House district                  Current state house district for voter, as supplied
                                by voter file. When not present on the voter file,
                                this may be imputed from the registration address.
Senate district                 Current state senate district for voter, as
                                supplied by voter file. When not present on the
                                voter file, this may be imputed from the
                                registration address.
City council                    Definition, usage, and fill rate may vary by state.


4. Merging and De-identifying Student Enrollment and Voting Records


The Clearinghouse performs the task of running the algorithm created by Catalist 
to match enrollment and voting records. As noted by Berent, Krosnick, and Lupia (2011), 
because private firms such as Catalist make earnings from proprietary models, there is 
potentially less transparency in the validation process when working with an outside 
company. For this reason, we consulted with Catalist on several occasions and turned 
to Ansolabehere and Hersh (2012) to better understand the method and quality of 
Catalist’s matching procedure. Catalist updates its registration records several times
a year from each jurisdiction, which improves the accuracy of the data. Catalist contracts
with and collects information from credit card companies, consumer surveys,
and government sources to improve its matching capability. Catalist also de-duplicates
records by linking records of the same person listed in a state’s voter file more than once
and runs all records through the Post Office’s National Change of Address Registry to
identify movers. This information is helpful for understanding, for instance, whether
college students register to vote after they have moved to a college campus or whether 
they remain registered at their home address. 


To identify a student in the snapshot of voting records, the Clearinghouse uses the
student’s name, date of birth, earliest known home address, and campus address. Catalist
collects a broad range of information on each of the individuals in its database.
Records submitted for matching against the Catalist national database are examined
through a sophisticated set of proprietary algorithms that utilize statistics, fuzzy logic,
and machine learning to determine the highest probability matches. (For specific
information about Catalist’s matching procedures, see Ansolabehere and Hersh, 2012, pp.
443-445.)


Once the two sets of records (Clearinghouse and Catalist) have been merged, each 
individual record is accompanied by a confidence score reflecting the similarity of the 
submitted record to the returned match record given all the possible combinations con- 
sidered by the matching algorithms. The average confidence rating for students in the 
NSLVE database is 96%. 

The process fully protects student privacy. After the matching process is complete,
the Clearinghouse removes all student-level identifying information and sends the anonymized
data file to IDHE researchers. All student records are linked to a college or
university’s OPE ID number, an identifier assigned by the U.S. Department of Education’s
Office of Postsecondary Education (OPE), and the privacy rights of students
under FERPA are fully protected in this process. At no time does NSLVE receive the
names or addresses of individual students. The only entity that knows the identity of 
individual students is the Clearinghouse, which already knows the students’ identities. 
Catalist personnel do not have access to the merged data, and the company may not and 
does not collect or store the data elements needed for the matching process. Any infor- 
mation that might allow the re-identification of an individual student (e.g., fewer than 
ten students in a particular field of study or demographic group) is replaced by the 
Clearinghouse with an asterisk. Date of birth is replaced with age on the date of election. 
The records are stored in a secure research drive at Tufts University. 
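
The two de-identification steps described above (replacing date of birth with age on election day, and masking groups of fewer than ten students with an asterisk) can be illustrated with a minimal sketch. This is not the Clearinghouse's actual code: the function names, the list-based data layout, and the choice to count group sizes within a single list are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def age_on_election_day(birth_date: date, election_day: date) -> int:
    """Return completed years of age on election day (hypothetical helper)."""
    years = election_day.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred by election day.
    if (election_day.month, election_day.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def suppress_small_groups(values, threshold=10, mask="*"):
    """Replace any value shared by fewer than `threshold` records with `mask`.

    `values` is assumed to hold one categorical value per student
    (e.g., field of study) within a single institution.
    """
    counts = Counter(values)
    return [v if counts[v] >= threshold else mask for v in values]


# Made-up example: 3 physics students and 12 business students.
fields = ["Physics"] * 3 + ["Business"] * 12
masked = suppress_small_groups(fields)
```

In this made-up example the three physics records are masked because the group has fewer than ten students, while the twelve business records are left unchanged.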


5. Calculating the Institutional Voting Rate 


The combined enrollment and voting records are then aggregated to the institution
level and merged with selected data from the U.S. Department of Education’s
IPEDS. IPEDS consists of nine interrelated survey components that are collected over 
three collection periods (fall, winter, and spring), either annually or semi-annually de- 
pending on the survey component. The completion of IPEDS surveys is mandatory for 
all institutions that participate in any federal financial assistance program authorized 
by Title IV of the Higher Education Act of 1965. The IPEDS system collects a variety of 
institutional characteristics, including type of institution, size, enrollment, financial aid, 
and other characteristics. Data collected for the 2012, 2014, and 2016 fall semesters were 
used in this database. 

We use IPEDS to account for nonresident aliens (NRAs) at an institution. We first
calculate the percentage of NRAs at an institution using IPEDS data. Next, we multiply
this IPEDS percentage by the Clearinghouse’s number of students at the institution,
which yields an estimated number of nonresident aliens at that institution. This
number is used in our calculations of an institution’s voting rate. To calculate an
institution’s voting rate, we first calculate the number of eligible voters at the
institution (number of enrolled students minus students younger than 18
minus the estimated number of NRAs). We then divide the number of students who
voted by this number of eligible voters.
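
The calculation described above can be written out as a short sketch. The function and parameter names are hypothetical, the inputs are made up, and the sketch assumes the estimated NRA count is left unrounded, which the text does not specify.

```python
def estimated_nra_count(ipeds_nra_share: float, nsc_enrollment: int) -> float:
    """Apply the IPEDS share of nonresident aliens to the Clearinghouse enrollment count."""
    if not 0.0 <= ipeds_nra_share <= 1.0:
        raise ValueError("ipeds_nra_share must be a proportion between 0 and 1")
    return ipeds_nra_share * nsc_enrollment


def institutional_voting_rate(
    nsc_enrollment: int,
    under_18: int,
    ipeds_nra_share: float,
    voted: int,
) -> float:
    """Voting rate = students who voted / estimated eligible voters."""
    # Eligible voters: enrolled students minus those under 18 minus estimated NRAs.
    eligible = nsc_enrollment - under_18 - estimated_nra_count(
        ipeds_nra_share, nsc_enrollment
    )
    if eligible <= 0:
        raise ValueError("estimated number of eligible voters must be positive")
    return voted / eligible


# Made-up inputs: 10,000 enrolled, 200 under 18, 5% NRAs per IPEDS, 4,650 voted.
example_rate = institutional_voting_rate(10_000, 200, 0.05, 4_650)
print(f"{example_rate:.1%}")
```

With these made-up inputs, the estimated NRA count is 500, the number of eligible voters is 9,300, and the voting rate is 50%.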


Conclusion 


Currently, the NSLVE database contains approximately 29 million records for stu- 
dents from over 1,000 institutions for the 2012, 2014, and 2016 U.S. national elections. 
NSLVE contains about 9 million records per election, and we are already preparing for 
the 2018 election and beyond. We continue to recruit new colleges and universities into 
the study. If your campus is interested in joining, please email idhe@tufts.edu to learn 
more about the study. The NSLVE database allows us to not only research college stu- 
dent political engagement, but also to offer data-driven recommendations to institu- 
tions for supporting political involvement among low-propensity voters in order to 
increase the political mobility of these students. We provide general resources on our 
website as well as customized services to campuses interested in shaping practices and 
campus culture to foster active participation in our democracy. Please visit
http://activecitizen.tufts.edu/idhe to learn more and to join our email list.